(57) Abstract

This invention provides a method for cleaving target proteins from Caulobacter S-layer protein under mild acid conditions. A
fusion protein secreted by Caulobacter, which includes a target protein and a Caulobacter S-layer secretion signal, may be cleaved at an
aspartate-proline dipeptide without solubilizing the fusion protein. This method may be carried out while the fusion protein is in an insoluble
aggregate, which facilitates recovery of the protein. This invention also provides a method of preparing a DNA construct for expression of
the fusion protein and a method of preparing the fusion protein.



WO 00/04170



PCT/CA99/00637 



CLEAVAGE OF CAULOBACTER PRODUCED 
RECOMBINANT FUSION PROTEINS 

FIELD OF INVENTION

This invention relates to the expression and secretion of recombinant fusion
proteins from Caulobacter wherein a heterologous polypeptide is fused with all or part
of the surface layer protein (S-layer protein) of the bacterium.

BACKGROUND OF THE INVENTION 

Many bacteria assemble layers composed of repetitive, regularly aligned,
proteinaceous sub-units on the outer surface of the cell. These layers are essentially 
two-dimensional paracrystalline arrays, and being the outer molecular layer of the 
organism, directly interface with the environment. In Caulobacter , the S-layer protein 
is synthesized by the cell in large quantities and the S-layer completely envelops the cell 
and thus appears to be a protective layer. 

Caulobacter are natural inhabitants of most soil and freshwater environments 
and may persist in waste water treatment systems and effluents. The bacteria alternate 
between a stalked cell that is attached to a surface, and an adhesive motile dispersal cell 
that searches to find a new surface upon which to stick and convert to a stalked cell.
The bacteria attach tenaciously to nearly all surfaces and do so without producing the 
extracellular enzymes or polysaccharide "slimes" that are characteristic of most other 
surface attached bacteria. Caulobacters have simple requirements for growth. The 
organism is ubiquitous in the environment and has been isolated from oligotrophic to 
mesotrophic situations. They are known for their ability to tolerate low nutrient level 
stresses, for example, low phosphate levels. 

All of the freshwater Caulobacter that produce an S-layer are similar and have
S-layers that are substantially the same under electron microscopy. The layers are
hexagonally arranged in all cases, with a similar centre-centre dimension (see: Walker,
S.G., et al. (1992) "Isolation and Comparison of the Paracrystalline Surface Layer
Proteins of Freshwater Caulobacters" J. Bacteriol. 174: 1783-1792).
16S rRNA sequence analysis of several S-layer producing Caulobacter strains
shows that they group closely (see: Stahl, D.A. et al. (1992) "The Phylogeny of Marine
and Freshwater Caulobacters Reflects Their Habitat" J. Bacteriol. 174: 2193-2198).
DNA probing of Southern blots using the S-layer gene from C. crescentus CB15
identifies a single band that is consistent with the presence of a cognate gene (see:
MacRae, J.D. and J. Smit (1991) "Characterization of Caulobacters Isolated from
Wastewater Treatment Systems" Applied and Environmental Microbiology 57:751-758).
Furthermore, antisera raised against the S-layer protein of CB15 react with
the S-layer protein of other Caulobacter (see: Walker, S.G. et al. (1992) [supra]). All
S-layer proteins isolated from Caulobacter may be substantially purified using the same
methods. All strains appear to have a polysaccharide species which may be required
for S-layer attachment (see: Walker, S.G. et al. (1992) [supra]).

The S-layers elaborated by freshwater isolates of Caulobacter are visibly
indistinguishable from the S-layer produced by Caulobacter strains CB2 and CB15.
The S-layer proteins from the latter strains have approximately 100,000 m.w., although
sizes of S-layer proteins from other species and strains will vary. The hydrophilic
S-layer protein has been characterized both structurally and chemically. It is composed of
ring-like structures spaced at 22 nm intervals arranged in a hexagonal manner on the
outer membrane. The S-layer is bound to the bacterial surface and may be removed by
low pH treatment or by treatment with a calcium chelator such as EDTA.

The similarity of S-layer proteins in different strains of Caulobacter permits the
use of a cloned S-layer protein gene of one Caulobacter strain for retrieval of the
corresponding gene in other Caulobacter strains (see: Walker, S.G. et al. (1992)
[supra]; and MacRae, J.D. et al. (1991) [supra]).

Expression of a heterologous polypeptide as a fusion product with the S-layer
protein of Caulobacter provides advantages not previously seen in systems for
production of recombinant fusion proteins using other organisms such as E. coli and
Salmonella. All known Caulobacter strains are believed to be harmless and are nearly
ubiquitous in aquatic environments. In contrast, many Salmonella and E. coli strains
are pathogens. Consequently, expression and secretion of a heterologous polypeptide
using Caulobacter as a vehicle has the advantage that the expression system will be
stable in a variety of outdoor environments and may not present problems associated
with the use of a pathogenic organism. Furthermore, Caulobacter are natural biofilm
forming species and may be adapted for use in fixed biofilm bioreactors. The quantity of
S-layer protein that is synthesized and secreted by Caulobacter is high, reaching 12%
of the cell protein.

There is an existing need to produce pure proteins and peptides in an economical
manner and in a manner that minimizes or simplifies the purification steps needed after
fermentation. Key commercial areas include the production of recombinant human and
animal therapeutic antibiotic and vaccine peptides, industrial enzymes, protein
polymers, and antibacterial enzymes for foodstuffs. Many of these commercial
applications require low production costs and there are few expression systems available
that can meet such cost restraints. In addition, there are numerous research applications
where rapid methods to produce and purify proteins are needed to facilitate the
discovery stage. This is especially true where there is a desire to express a large
number of proteins with unknown function (from a collection of cloned cDNAs, for
example) or a large number of variants of a single protein (for example, resulting from
site directed mutagenesis) in a search for variants with improved properties.

Generally, proteins must be secreted to be produced at low cost. The primary
reason is the much reduced cost of purification of the target protein from cell material.
However, even for secreted proteins, simple methods of separating the product from
spent culture and cells are important for cost reduction and ease of use.

An international patent application published as WO 97/34000 on September 18,
1997 describes the expression and secretion of recombinant proteins from Caulobacter
in which the recombinant protein is a fusion of all or part of Caulobacter S-layer protein
with a heterologous protein of interest (also see: Bingle, W.H., et al. (1997) "Linker
Mutagenesis of the Caulobacter crescentus S-layer Protein: Toward a Definition of an
N-terminal Anchoring Region and a C-terminal Secretion Signal and the Potential for
Heterologous Protein Secretion" J. Bacteriol. 179:601-611).

The Caulobacter S-layer secretion apparatus is in the category of "Type 1"
secretion usually found in pathogenic bacteria and noted for its ability to secrete a wide
variety of proteins including large and hydrophilic proteins. The Caulobacter protein
secretion system is particularly useful to secrete recombinant proteins.

The Caulobacter S-layer Type 1 secretion pathway requires only a C-terminal 
secretion signal, typically comprising about 200 amino acids at the end of the protein. 
The export mechanism is capable of tolerating a wide variety of foreign proteins. 
Recombinant proteins may be conveniently produced as fusion proteins with the target
protein being fused to the C-terminal secretion signal. Depending on the application, it 
may be desirable to remove the secretion signal following secretion. Not removing the 
secretion signal may be an approach suitable for many subunit vaccine applications, 
where the remaining S-layer protein serves as a carrier. 

A unique and desirable feature of fusion proteins produced by the Caulobacter
S-layer protein secretion system is that they form insoluble aggregates in the culture
medium. This is apparently a consequence of the S-layer sequences associated with
the secretion signal and reflects the fact that the protein normally self-assembles into a
two-dimensional crystalline layer on the bacterium's surface. These aggregates are visible
to the naked eye and are readily collected by simple filtration. With simple water wash
steps, residual bacterial cells are readily flushed away. It is routinely possible to
achieve a protein purity of 90% or better with this simple purification procedure.

DESCRIPTION OF THE PRIOR ART 

Most current protein purification systems for recombinant proteins produced by
bacteria rely upon an affinity matrix to achieve separation of the target protein and to
concentrate the protein for subsequent steps of purification. To accomplish this, genes
for recombinant proteins are commonly constructed so that they contain affinity tags,
which are protein sequences that will bind to an affinity matrix. Commonly used
systems include the following:

(a) glutathione S-transferase (GST) tag, which binds to glutathione-sepharose
matrices;

(b) maltose binding protein (MBP) tag, which binds to amylose matrices;

(c) multiple tandem histidine residues (e.g. "His-6") tag, which binds to
nickel-derivatized solid matrices; and

(d) protein A tag, which binds to immunoglobulin IgG-derivatized sepharose or
comparable matrices.

Prior art techniques were typically developed so that removal of a target protein
does not disrupt the tag and matrix association. Instead, enzymes that cleave specific
sequences of amino acids are employed. The enzyme cleavage sequence is positioned
between the tag and the desired recombinant protein and enzymatic cleavage is effected
directly on the matrix with attached fusion protein. If a secretion signal is used, the
cleavage site is usually positioned such that the secretion signal is separated from the
target recombinant protein during the cleavage step. The matrix is regenerated for
re-use only after the target recombinant protein has been purified away from the matrix.
Typical enzymes used in these methods are Factor Xa, enterokinase and collagenase.

Chemical cleavage is generally not used because the conditions required for
cleavage will disrupt the binding of affinity tag and matrix or destroy the matrix. When
chemical cleavage is used with recombinant fusion proteins to cleave target protein from
a secretion signal and/or affinity tag, solubilization and denaturation processes are
generally employed. The expectation is that complete or nearly complete unfolding of
the protein is a prerequisite for effective cleavage.

Mild-acid cleavage is predicated on the inclusion, by happenstance or design, of
the acid-sensitive aspartate-proline dipeptide at a desired site for cleavage. The protein
to be cleaved is typically exposed to conditions that solubilize and/or completely
denature the protein prior to cleavage. The chaotropic agent guanidine hydrochloride
(used at 6-7 M) is commonly employed to denature and solubilize the protein prior to,
or at the same time as, acid treatment. Alternatively, high concentrations of acids that also
serve as solubilizing agents (as examples: 70-90% formic acid, acetic acid [10%]
pyridine, or relatively high concentrations of HCl (60 mM or more)) are employed.
Because such conditions would disrupt a tag/affinity matrix association, direct cleavage
of an affinity tag from the target protein while a protein remains associated with an
affinity matrix is not attempted.

General conditions for cleavage at aspartate-proline sites are described in
Current Protocols in Molecular Biology (supp. 28; chapter 16.4) John Wiley & Sons
Inc. 1994, and in Landon, M. "Cleavage at Aspartyl-Prolyl Bonds" in Methods in
Enzymology (1977) 47: 145-149. These references suggest that significant variability
of cleavage conditions exists for different proteins and that cleavage might occur in some
instances without first denaturing or solubilizing the protein. However, in practice, the
latter circumstances are rare and proteins to be subjected to acid cleavage at Asp-Pro
dipeptides are usually solubilized to a state where there is no visible turbidity. Such
solubilized protein will normally not pellet when centrifuged at 100,000 x g for 1 hour.
It is now shown that mild-acid conditions may be used for cleavage of aspartate-proline
sites in Caulobacter S-layer fusion proteins without placing the protein in a solubilized
state as described above.

SUMMARY OF INVENTION 

This invention is based on the unexpected discovery that recombinant fusion
proteins produced by the Caulobacter S-layer protein secretion system can be cleaved
under mild-acid conditions and solubilization of the fusion protein is not required.
Cleavage may be accomplished while the fusion protein is in the form of an insoluble
aggregate typical of the Caulobacter S-layer protein. Cleavage occurs at
aspartate-proline dipeptides which may be in a heterologous protein portion of the fusion
protein or in a portion that is native to the Caulobacter S-layer portion. The dipeptide may be
placed at a desired location for cleavage by engineering DNA encoding the fusion
protein to express the dipeptide at the desired location. A preferable location for
cleavage may be at or near the junction between a heterologous (target) protein and the
Caulobacter S-layer portion comprising the Caulobacter secretion signal, such that a
cleavage product will be the target protein in its entirety and substantially free of
extraneous amino acids.

The current invention makes it possible to cleave a heterologous (target) protein
from the S-layer protein portion using only mild-acid conditions, even while the fusion
protein is in an aggregated form. These cleavage conditions do not result in significant
solubilization of the S-layer protein portion.

This invention provides a method of cleaving a fusion protein including a first
component which comprises all or part of a Caulobacter S-layer protein including a
Caulobacter C-terminal secretion signal, and a second component heterologous to
Caulobacter. The fusion protein contains at least one aspartate-proline dipeptide. The
method comprises combining the fusion protein with an acid solution of a strength
insufficient to solubilize the fusion protein for a time sufficient for cleavage of the
fusion protein at the aspartate-proline dipeptide. The acid solution may have a pH of
from about 1.5 (e.g. 1.5 ± 0.1) to about 2.5 (e.g. 2.5 ± 0.1), and preferably from about
1.65 (e.g. 1.65 ± 0.05) to about 2.35 (e.g. 2.35 ± 0.05). Preferred pH conditions may
be achieved using an acid equivalent in the range of about 5 to about 20 mM HCl.
The method is typically carried out at a temperature in the range of approximately room
temperature to about 50°C.

This invention also provides a method of preparing a DNA construct suitable for
expression of a fusion protein suitable for use in the method of this invention. The
method comprises joining an upstream DNA segment including DNA heterologous to
Caulobacter which includes a protein of interest to a downstream DNA segment
including DNA for a Caulobacter C-terminal secretion signal which does not encode an
aspartate-proline dipeptide. The upstream segment contains DNA encoding an
aspartate-proline dipeptide at or near the junction between said upstream and
downstream segments.

This invention also provides a method of preparing a fusion protein, comprising 
the steps of expressing a DNA construct as described above in Caulobacter and 
recovering said fusion protein once secreted by the Caulobacter. 

Once cleavage is accomplished according to this invention, the S-layer portion
comprising the Caulobacter secretion signal may remain as an insoluble aggregate. If
the target protein is soluble, the S-layer portion may be easily separated from the target
recombinant protein by simple centrifugation or filtration methods. Thus the system of
this invention facilitates separation as would a tag/affinity matrix system except that
here, the system is also the means for producing an insoluble matrix. In addition, the
insoluble matrix produced by this invention is resistant to the effects of the acid
treatment, allowing direct cleavage of the target recombinant protein. In this way, a
very inexpensive chemical cleavage method can be employed to economically retrieve
recombinant proteins from a bacterial fusion protein. In contrast to the cost of most
affinity matrices, there is little expense associated with the use of the S-layer secretion
signal as it is simply a part of the fermentation/secretion process.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION 

Production of Recombinant Fusion Proteins Using 
the Caulobacter S-layer Secretion System 

Proteins may be produced using the Caulobacter S-layer Type 1 secretion
pathway which requires only the C-terminal secretion signal of the Caulobacter. This
signal is the C-terminal portion of the S-layer protein, which typically comprises about
200 amino acids. (See: Bingle et al. (1997) [supra]; and, WO 97/34000). Additional
Caulobacter S-layer DNA upstream from the secretion signal may also be present and
may be desirable to encode portions of the S-layer protein which will contribute to
aggregate formation of the secreted protein. Such additional Caulobacter DNA may
constitute most or all of the remainder of the DNA encoding the S-layer protein.

Standard techniques (such as methods described in WO 97/34000) may be used
to identify the amount of the C-terminal portion of a particular Caulobacter S-layer
protein which functions as the secretion signal.

Creation of fusion proteins is commonly done by preparing DNA which codes 
for the target protein and fusing it in-frame with the C-terminal region of the S-layer 
gene. There are numerous possible methods, with the following being examples. 

1. Oligonucleotide Chemical Synthesis. This involves the design of
complementary single strands, complete with desirable restriction endonuclease cut sites
at the ends, chemical synthesis of the strands followed by annealing, and cloning into a
plasmid vector, juxtaposed to an appropriate portion of the C-terminal region of the
S-layer gene.

2. Production of the Target Gene DNA by Polymerase Chain Reaction (PCR)
Amplification of a Target Sequence. In this case, appropriate in-frame restriction
sites are incorporated into the short oligonucleotides used for amplification of a target
sequence, such that the final PCR product can be treated with the appropriate restriction
enzymes (to create the restriction site "sticky ends"), followed by cloning into a plasmid
vector, juxtaposed to an appropriate portion of the C-terminal region of the S-layer
gene.

3. Adapting Restriction Endonuclease Cleavage Sites that are Native to a
Target Protein Gene Sequence for Fusion to the DNA Coding for the C-terminal
S-layer Secretion Signal to Accomplish In-frame Expression of a Chimeric Protein.
This can be accomplished by direct ligation (although it is uncommon that an
appropriate match will occur), or the use of adapter sequences or methods involving
blunting of a restriction site and subsequent blunt-end ligation to change expression
reading frame or join unlike restriction site sticky ends.
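
All three approaches above depend on the target coding sequence joining the S-layer secretion signal in-frame. A minimal sketch of that check follows. It assumes both segments begin at the first base of a codon, and the sequences used are hypothetical and are not taken from this specification.

```python
# Illustrative check only; the sequences below are hypothetical examples.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def fusion_is_in_frame(upstream, downstream):
    """Return True if upstream + downstream reads as one open reading frame.

    Assumes each segment starts at codon position 1 of its own coding
    sequence (an assumption of this sketch, not a statement of the method).
    """
    upstream = upstream.upper()
    downstream = downstream.upper()
    # The junction must fall on a codon boundary.
    if len(upstream) % 3 != 0:
        return False
    fused = upstream + downstream
    usable = len(fused) - len(fused) % 3
    codons = [fused[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon before the final codon would truncate the fusion protein.
    return not any(codon in STOP_CODONS for codon in codons[:-1])

target = "ATGGGCGACCCG"   # hypothetical: Met-Gly-Asp-Pro
signal = "GCCACCTAA"      # hypothetical: Ala-Thr-stop
print(fusion_is_in_frame(target, signal))
print(fusion_is_in_frame(target + "G", signal))
```

A single extra base at the junction, as in the second call, is the kind of frame shift that the adapter or blunt-end methods in item 3 are used to correct.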

There will be numerous convenient sites for fusion with the C-terminal regions
of the S-layer that lead to the successful expression, secretion and aggregation of a
recombinant fusion protein. Some example positions are at or near the DNA sites
corresponding to amino acids 622, 690, 784, 892 and 907 of the C. crescentus S-layer
gene (see: Appendix 1 and, WO 97/34000). Other sites of fusion with the S-layer gene
may also be employed. Most often a plasmid vector is designed such that the
C-terminal gene segment is resident on a plasmid with appropriate restriction sites placed
at the N-terminal junction of the S-layer fragment. Target recombinant protein gene
segments are then cloned into those restriction sites. It is typical to prepare initial
plasmid constructs that are replicated in E. coli. After a construct is produced, it is
typically transferred to a broad host range plasmid which can then be introduced into
the appropriate Caulobacter strain by electroporation. Suitable broad host range
plasmids can be constructed from (but are not limited to) the IncQ, IncW and IncP1
plasmid incompatibility groups.

The introduction of the aspartate-proline (Asp-Pro) dipeptide at the appropriate
site in the fusion protein can be done in several ways. Some examples are:

(a) incorporating a DNA sequence necessary to express the Asp-Pro
dipeptide into the oligonucleotides used to prepare the target sequence, either by
oligonucleotide synthesis or PCR methods;

(b) preparing a DNA segment with appropriate restriction sites at the termini
so that an Asp-Pro dipeptide can be introduced (most often at the junction between
S-layer and target gene) after a fusion recombinant S-layer gene has been made; and

(c) use of a native Asp-Pro dipeptide in either the target DNA or the S-layer
segment (for example, an Asp-Pro dipeptide is located at amino acids 692 and 693 of
the C. crescentus S-layer gene and is suitable for fusions made at the amino acid site).

The methods described above are not the only methods that may be used for
creating and expressing fusion recombinant S-layer proteins, nor is it necessary to have
the engineered genes resident on a plasmid. For example, the expressed gene may be
introduced into the chromosome (using well-known gene insertion or replacement
techniques) and still achieve secretion of the recombinant proteins (see WO 97/34000).
In some cases it may be desirable to produce recombinant fusion proteins as insertions
of heterologous DNA in the middle of the S-layer gene. In such a case, Asp-Pro
dipeptide sequences could be engineered at the N and C-termini of the target peptide.
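
Whichever placement is chosen, a fusion sequence can be scanned for Asp-Pro dipeptides and the expected acid-cleavage fragments listed before a construct is built. The following is a minimal sketch of such a scan. It assumes one-letter amino acid codes (D for Asp, P for Pro) and cleavage of the bond between the Asp and the Pro, and the example sequence is hypothetical.

```python
# Illustrative scan only; the example sequence is hypothetical.

def asp_pro_sites(protein):
    """Return the 1-based position of each Asp (D) that is followed by Pro (P)."""
    seq = protein.upper()
    return [i + 1 for i in range(len(seq) - 1) if seq[i:i + 2] == "DP"]

def acid_fragments(protein):
    """Split a sequence between D and P at every Asp-Pro dipeptide."""
    seq = protein.upper()
    fragments = []
    start = 0
    for pos in asp_pro_sites(seq):
        fragments.append(seq[start:pos])  # this fragment ends with the Asp
        start = pos                       # the next fragment begins with the Pro
    fragments.append(seq[start:])
    return fragments

example = "MKTAYDPGSLLDPAA"  # hypothetical peptide flanked by two Asp-Pro sites
print(asp_pro_sites(example))
print(acid_fragments(example))
```

In this hypothetical case the middle fragment corresponds to a target peptide inserted within the S-layer sequence, released by Asp-Pro sites at both of its termini. It retains an N-terminal Pro and a C-terminal Asp.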

All possible codon combinations for Asp-Pro will work, but the CCA codon for
proline is not preferred due to the likelihood of a low amount of the corresponding
tRNA being present in Caulobacter. The following is an approximate codon usage table
for C. crescentus.

TABLE 1

Caulobacter crescentus Codon Usage Table
[Amino Acid] [Triplet Code] [Frequency Per Thousand]

Phe  UUU   2.5    Ser  UCU   1.2    Tyr  UAU   6.6    Cys  UGU   0.6
Phe  UUC  27.0    Ser  UCC   8.5    Tyr  UAC   9.6    Cys  UGC   5.5
Leu  UUA   0.0    Ser  UCA   1.2    STOP UAA   0.8    STOP UGA   1.6
Leu  UUG   4.4    Ser  UCG  25.7    STOP UAG   0.6    Trp  UGG   7.2

Leu  CUU   4.4    Pro  CCU   2.5    His  CAU   3.2    Arg  CGU   7.6
Leu  CUC  15.7    Pro  CCC  15.5    His  CAC  12.2    Arg  CGC  44.7
Leu  CUA   1.1    Pro  CCA   0.9    Gln  CAA   3.7    Arg  CGA   3.0
Leu  CUG  72.3    Pro  CCG  27.1    Gln  CAG  30.2    Arg  CGG  12.1

Ile  AUU   2.4    Thr  ACU   1.2    Asn  AAU   4.1    Ser  AGU   0.8
Ile  AUC  49.0    Thr  ACC  37.3    Asn  AAC  23.8    Ser  AGC  14.9
Ile  AUA   0.3    Thr  ACA   0.8    Lys  AAA   2.7    Arg  AGA   0.4
Met  AUG  25.7    Thr  ACG  16.8    Lys  AAG  37.9    Arg  AGG   1.1

Val  GUU   5.4    Ala  GCU   9.5    Asp  GAU  11.1    Gly  GGU   9.5
Val  GUC  42.7    Ala  GCC  84.1    Asp  GAC  48.5    Gly  GGC  64.8
Val  GUA   1.0    Ala  GCA   2.2    Glu  GAA  20.5    Gly  GGA   2.3
Val  GUG  30.7    Ala  GCG  36.7    Glu  GAG  45.4    Gly  GGG   7.7
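
The Asp and Pro entries of Table 1 can be used to rank the eight possible codon pairs encoding the Asp-Pro dipeptide. The sketch below scores each pair by the frequency of its rarer codon. This scoring rule is an illustrative assumption and is not a method stated in this specification.

```python
# Frequencies per thousand codons, copied from Table 1 (RNA codons).
ASP = {"GAU": 11.1, "GAC": 48.5}
PRO = {"CCU": 2.5, "CCC": 15.5, "CCA": 0.9, "CCG": 27.1}

def rank_asp_pro_codon_pairs():
    """Return (six-base sequence, score) tuples, best-scoring pair first.

    The score is the frequency of the rarer of the two codons, a simple
    heuristic assumed here for illustration only.
    """
    pairs = [
        (asp + pro, min(asp_freq, pro_freq))
        for asp, asp_freq in ASP.items()
        for pro, pro_freq in PRO.items()
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

for sequence, score in rank_asp_pro_codon_pairs():
    print(sequence, score)
```

Under this heuristic GAC-CCG ranks first and both pairs containing CCA rank last, which is consistent with the stated preference against the CCA codon.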
Large quantities (e.g. 12% of total cell protein/3% of input organic carbon) of a
wide range of proteins can be produced, with yields in the order of 250 mg/liter of
batch culture. Fusion proteins with 35 kDa of target peptide are secreted with little
difficulty, although proteins with multiple cysteines may be more difficult to express.
Post-expression glycosylation of proteins does not occur, an advantage for most peptide
expression applications.

Host Expression Strains

For secretion of recombinant fusion S-layer proteins, the Caulobacter strain will
preferably be one which has lost the ability to produce a native S-layer protein, while
retaining a fully functional S-layer protein secretion apparatus. Such strains may be
obtained by screening for mutants that have spontaneously become S-layer protein
negative; or, by directed genetic manipulation, such as (but not limited to) the insertion
of a drug resistance cassette in the middle of the S-layer gene or the substitution of a
version of the S-layer gene which has had a sizeable internal region deleted from the
gene (see: Bingle et al. (1997) [supra]; Bingle et al. (1997) "Cell Surface Display of a
Pseudomonas aeruginosa PAK Pilin Peptide with the Paracrystalline Layer of
Caulobacter crescentus" Molec. Microbiol. 26:277-288; and, Edwards and Smit (1991)
"A Transducing Bacteriophage for Caulobacter crescentus Uses the Paracrystalline Surface
Layer Protein as a Receptor" J. Bacteriol. 173: 5568-5572). In the case of a genetic
manipulation, a common method for producing such strains is to modify a copy of the
S-layer gene while on a plasmid and then to use well known gene replacement methods
to substitute the modified gene for the native gene in the Caulobacter chromosome (see:
Edwards and Smit (1991) [supra]).

If an entire S-layer gene is to be used for production of a recombinant protein
(via insertion of a target sequence), strains defective in the production of the
lipopolysaccharide (LPS) used for S-layer attachment to the bacterial surface can be
used. These can be prepared by forcing Caulobacter to grow without exogenous
calcium. Under these conditions mutants arise that are uniformly defective in
producing a proficient version of the S-layer LPS (see: Walker, S.G. et al. (1994)
"Characteristics of Mutants of Caulobacter crescentus Defective in Surface Attachment
of the Paracrystalline Layer" J. Bacteriol. 176: 6312-6323).

All Caulobacter S-layer producing strains are suitable for this technology. One
may isolate the S-layer gene from a particular strain (using homology between
Caulobacter S-layers to design probes to detect and clone the S-layer genes) and adapt
the C-terminal region for recombinant protein expression, in a manner similar to that
done for C. crescentus strains (see: MacRae and Smit (1991) [supra], and Walker, S.G.
et al. (1992) [supra]). Alternatively, one may construct recombinant fusion S-layer
genes using the C. crescentus S-layer gene and express the recombinant genes in
alternate Caulobacter hosts.

Freshwater Caulobacter producing S-layers may be readily detected by negative
stain transmission electron microscopy techniques. Caulobacter may be isolated using
the methods outlined by MacRae and Smit (1991) [supra], which take advantage of the
fact that Caulobacter can tolerate periods of starvation while other soil and water
bacteria may not and that they all produce a distinctive stalk structure, visible by light
microscopy (using either phase contrast or standard dye staining methods). Once
Caulobacter strains are isolated in a typical procedure, colonies may be suspended in
2% ammonium molybdate negative stain and applied to plastic-filmed, carbon-stabilized
300 or 400 mesh copper or nickel grids and examined in a transmission electron
microscope at 60 kilovolt accelerating voltage (see: Smit, J. (1986) "Protein Surface
Layers of Bacteria", in Outer Membranes as Model Systems (M. Inouye, ed.) J. Wiley
& Sons, at p. 343-376). S-layers are seen as two-dimensional geometric patterns most
readily on those cells in a colony that have lysed and released their internal contents.

Recombinant Protein Purification

Secreted proteins are separated and shed into the culture media as a macroscopic
precipitate (the "aggregate" referred to herein). The shedding phenomenon is a
consequence of the absence of the N-terminal region of the S-layer protein in the
expressed recombinant protein, or the loss of the lipopolysaccharide species used for
S-layer attachment by the Caulobacter (see: Walker, S.G. et al. (1994) [supra]).
Typically, the aggregate forms as loose, gel-like lumps of pure protein that can readily
be retrieved and separated from the bacteria by simple filtration.

The aggregate may be readily separated from a soluble cleaved target protein by
any suitable technique such as filtration or centrifugation. If the target protein is
insoluble once cleaved, it may be convenient to then solubilize one or both of the
proteins (for example in 8M urea or 6M guanidine HCl) and separate by
chromatography. In this way, only 2 species of protein need to be separated.

Cleavage of Fusion Proteins

General procedures for performing mild-acid cleavage are known from the
prior art as described above. In the method of this invention, conditions are adjusted to
avoid destruction of the target protein or solubilization of the aggregate containing the
S-layer secretion signal. Excess acid or too high a temperature may increase the
occurrence over time of random cleavages along the length of the fusion protein, which
is to be avoided since such random cleavages may lead to undersized fragmentation of
the fusion protein or solubilization of the aggregated S-layer portion.

Good yields of target protein with minimum random breaks in the fusion protein
may generally be achieved by using from 5-20 mM HCl (or its equivalent while
employing another acid). The respective pH of these conditions (unbuffered acid
solution) is from about 2.3 to about 1.7. Time and temperature are preferably adjusted
by routine monitoring to achieve the desired cleavage while minimizing random breaks.
For example, temperature may range from room temperature to about 50° C. Time of
treatment may range from about 12 to about 72 hours. Time or temperature outside of
these ranges is permissible depending upon the strength of the acid and the accepted
yield. Generally, lower yields are obtained with less acid strength, less time or lower
temperatures.
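
The correspondence between 5-20 mM HCl and a pH of about 2.3 to about 1.7 can be reproduced with a short calculation. The sketch below assumes an unbuffered solution of a fully dissociated monoprotic acid and ignores activity corrections, so the figures are approximate.

```python
import math

def ph_of_strong_acid(molar):
    """Approximate pH of an unbuffered, fully dissociated monoprotic acid.

    Assumes the hydrogen ion concentration equals the acid concentration
    (no activity correction), which is a simplification.
    """
    return -math.log10(molar)

for millimolar in (5, 20):
    ph = ph_of_strong_acid(millimolar / 1000)
    print(f"{millimolar} mM HCl -> pH {ph:.2f}")
```

This calculation does not apply to a buffered suspension or to a partly dissociated acid, where the pH is better measured directly.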

In the following examples, efficiency of cleavage in the order of 40-80% is
achieved using conditions the same as or similar to the following alternatives:

- 5 mM HCl at 50° C. for 48-72 hours

- 20 mM HCl at 30° C. for 48-72 hours.

Conditions in excess of the aforementioned values may be employed in some
cases with the possibility of random breaks increasing, particularly with increased acid
strength or temperature. In the following examples, significant random cleavage
occurred with 50 mM HCl at 50° C. after 48 hours.

Any acid may be employed in this invention which is normally used in solutions
to which proteins are exposed. Acids which have a deleterious effect on proteins under
dilute conditions should be avoided. For example, HCl or an equivalent amount of
H2SO4 may be used in this invention but oxidizing acids such as nitric acid may not be
suitable.

Example 1. Cleavage of artificial silk protein sequences
from a secretion signal containing a native aspartate-proline cleavage site.

An artificial protein sequence resembling spider silk was constructed by
synthesis of partially overlapping and complementing oligomers of DNA, which were
then completed to a full duplex DNA with TaqI polymerase extension, to create a
sequence that coded for 97 amino acids. The resulting DNA sequence and
corresponding amino acid sequence are shown in Appendix 2.
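
As a non-limiting illustration of how the DNA sequence relates to the amino acid sequence, the following sketch translates the first twenty codons listed in Appendix 2 using the standard genetic code. The helper names are hypothetical, and the codon string is copied from the first row of Appendix 2.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First twenty codons of the artificial silk sequence (Appendix 2).
FIRST_CODONS = (
    "GAA TTC AGA TCT CAG GGC GCG GGG CAG GGT "
    "GGC TAT GGT GGG CTC GGC TCG CAA GGC GCT"
)

print(translate(FIRST_CODONS))  # EFRSQGAGQGGYGGLGSQGA
```

The printed string matches the first row of one-letter amino acid codes given in Appendix 2.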

The DNA sequence shown in Appendix 2 was cloned into a gene carrier
sequence residing in a pUC8 plasmid cloning vector. The gene segment carrier had
BamHI restriction sites at each end and an internal BglII site. This combination of
restriction sites allowed the production of multimers of the above sequence, relying on
the fact that BamHI sticky ends will ligate into BglII sticky ends, with the loss of both
restriction sites. Thus one copy of the silk-like sequence within the gene segment
carrier can be put inside a second copy of the same to produce a dimer. Using this
principle, an 8X repeat was produced, fused to DNA encoding the S-layer secretion
signal corresponding to the C-terminal portion of the C. crescentus S-layer protein from
about amino acid 690 onwards (see: Appendix 1). This fusion protein gene was
introduced into strain CB2A on a broad host range plasmid vector. The 8X multimer
appeared to be unstable, resulting in recombination events that reduced the 8X multimer
to a 3X size. The 3-fold repeat of the above 97 amino acid sequence, fused to the
S-layer secretion signal, was secreted. Protein was collected and subjected to treatment
with 5 mM HCl for 2 days at 50° C. The result was the liberation of about 80% of
soluble silk-like polymer which was readily separated by filtration from the S-layer
protein which remained completely aggregated under these conditions. Cleavage
occurred at the native aspartate-proline dimer in the Caulobacter S-layer signal region (see:
Appendix 1, amino acids numbered 692-693).
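
The multimerization principle relied on in this example, namely that BamHI and BglII ends ligate to one another with loss of both sites, may be sketched as follows. The sketch uses the well-known recognition sequences of the two enzymes (BamHI cuts G^GATCC, BglII cuts A^GATCT); the function names are hypothetical and the model considers only the top strand, which suffices because both sites are palindromic.

```python
# Recognition sites written 5'->3' on the top strand.
BAMHI_SITE = "GGATCC"  # BamHI cuts after the first G
BGLII_SITE = "AGATCT"  # BglII cuts after the first A


def overhang(site: str) -> str:
    """Four-base 5' overhang exposed when the site is cut after base 1."""
    return site[1:5]


def ligate(upstream_site: str, downstream_site: str) -> str:
    """Junction formed when the upstream half of one cut site is joined
    to the downstream half of another cut site."""
    return upstream_site[:1] + downstream_site[1:]


# Both enzymes leave the same GATC overhang, so the ends are compatible.
print(overhang(BAMHI_SITE), overhang(BGLII_SITE))

# A hybrid junction matches neither recognition site and cannot be recut.
for hybrid in (ligate(BAMHI_SITE, BGLII_SITE), ligate(BGLII_SITE, BAMHI_SITE)):
    recut = hybrid in (BAMHI_SITE, BGLII_SITE)
    print(hybrid, "recognized by BamHI or BglII:", recut)
```

Because the hybrid junctions are not recut, a multimer assembled in this way retains only its outer BamHI sites and its single internal BglII site, so the insertion step can be repeated.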

Example 2. Cleavage of the salmonid virus Infectious Pancreatic Necrosis
Virus (IPNV) surface glycoprotein candidate vaccine sequence from an
S-layer secretion signal containing a native aspartate-proline site.

The surface glycoprotein of the IPNV strain is a vaccine candidate. For this
example and Example 4, the sequence of the first 257 amino acids of the mature protein
and the corresponding DNA sequence as shown in Appendix 3 were used.

DNA encoding a segment of the major surface glycoprotein gene of IPNV
specifying amino acids 145-257 of the protein was fused to DNA sequence specifying
two putative T-cell activating epitopes: MVF (SEQ ID No:1; LSEKGVIVHRLEGV,
derived from Measles Virus protein F) and P2 (SEQ ID No:2; QYIKANSKFIGITEL,
derived from tetanus toxoid protein). The T-cell epitopes were positioned on the
C-terminal end of the IPNV sequence. This chimeric protein was in turn fused in frame
with the C. crescentus S-layer gene at about amino acid 690 position of the gene and
introduced into Caulobacter on a broad host range plasmid vector. The resulting
secreted protein was collected and treated with 5 mM HCl for 2 days at 50° C.
Cleavage occurred at the native aspartate-proline dimer described in Example 1. The
result was the liberation of about 75% of soluble vaccine candidate chimeric protein
from the S-layer secretion signal which remained aggregated.

Example 3. Cleavage of segments of an E. coli type I pilus tip subunit from
an S-layer secretion signal containing a native aspartate-proline cleavage site.

The FimH gene product is the tip pilus subunit of the E. coli strains involved
with urinary tract infections. Two segments, T3 (specifying the first 145 amino acids
of the mature peptide) and T7 (specifying the entire 258 amino acids of the mature
peptide) were fused to the S-layer secretion signal at about amino acid 690 of the
S-layer sequence. The T3 and T7 sequences are shown in Appendix 4.

The fusion protein genes were introduced into strain CB2A on a broad host
range plasmid vector. In both cases the resulting secreted protein was collected and
treated with 5 mM HCl for 2 days at 50° C. In both cases, the result was the liberation
of about 50% of soluble vaccine candidate chimeric protein from the S-layer secretion
signal which remained aggregated. Cleavage occurred at the native aspartate-proline
dimer described in Example 1.

Example 4. Cleavage of the salmonid virus IPNV surface glycoprotein
candidate vaccine sequence from an S-layer secretion signal containing
an introduced aspartate-proline cleavage site.

A segment of the major surface glycoprotein gene of IPNV specifying amino
acids 1-257 of the protein shown in Appendix 3 was fused to a DNA sequence
specifying a peptide containing an aspartate-proline dipeptide (SEQ ID No: 3;
SPLGPAGDPEAS) such that the aspartate-proline dipeptide was positioned very near
the C-terminus of the chimeric protein. This chimeric protein was in turn fused in
frame with the C. crescentus S-layer gene at about amino acid 784 position of the gene
and introduced into strain CB2A on a broad host range plasmid vector. The resulting
secreted protein was collected and treated with 5 mM HCl for 2 days at 50° C.
Cleavage occurred at the introduced aspartate-proline dipeptide. The result was the
liberation of about 40% of insoluble vaccine candidate chimeric protein from the
S-layer secretion signal which remained aggregated.
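
For illustration only, the position of an aspartate-proline dipeptide in a one-letter amino acid sequence, and the fragments expected if the Asp-Pro peptide bond is hydrolyzed, can be worked out as in the sketch below. The function name is hypothetical; the sketch idealizes cleavage as complete and occurring only between Asp (D) and Pro (P), and it is applied here to the linker peptide of SEQ ID No: 3.

```python
def cleave_at_asp_pro(protein: str) -> list[str]:
    """Split a one-letter protein sequence between every D and P of a DP pair.

    Idealized model: the Asp stays at the C-terminus of the upstream
    fragment and the Pro becomes the N-terminus of the downstream fragment.
    """
    fragments = []
    start = 0
    index = protein.find("DP", start)
    while index != -1:
        fragments.append(protein[start:index + 1])  # up to and including D
        start = index + 1                           # next fragment starts at P
        index = protein.find("DP", start)
    fragments.append(protein[start:])
    return fragments


LINKER = "SPLGPAGDPEAS"  # SEQ ID No: 3
print(cleave_at_asp_pro(LINKER))  # ['SPLGPAGD', 'PEAS']
```

In this idealized model the portion of the linker up to and including the aspartate remains on the target protein, while the residues from the proline onward remain on the S-layer secretion signal.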

Longer DNA and amino acid sequences referred to above are set out in the
following Appendices which are part of this description. Appendix 1 sets out the
complete nucleotide sequence of the C. crescentus S-layer gene (SEQ ID No: 4) with
the upstream sequence including the -35 and -10 sites of the promoter region and the
Shine-Dalgarno sequence. The start codon is at nucleotide 101 and the coding sequence
runs to and includes nucleotide 3179. The amino acid sequence of the C. crescentus
S-layer protein (SEQ ID No: 5) included in Appendix 1 is predicted from the DNA
sequence. Appendix 2 sets out the artificial spider silk DNA sequence (SEQ ID No: 6)
used in Example 1 and the corresponding amino acid sequence (SEQ ID No: 7).
Appendix 3 sets out the DNA sequence (SEQ ID No: 8) and corresponding amino acid
sequence (SEQ ID No: 9) of the first 257 amino acids of IPNV as described in
Examples 2 and 4. Appendix 4 sets out the T3 protein sequence (SEQ ID No: 10) and
the T7 protein sequence (SEQ ID No: 11) as described in Example 3.

All publications, patents and patent applications referred to herein are hereby
incorporated by reference. While this invention has been described according to
particular embodiments and by reference to certain examples, it will be apparent to
those of skill in the art that variations and modifications of the invention as described
herein fall within the spirit and scope of the attached claims.

GCTATTGTCG ACCTTATGACG TTTGCTCTAT AGCCATCGCT GCTCCCATGC GCGCCACTCG 60
GTCGCAGGGG GTGTGGGATT TTTTTTGGGA GACAATCCTC ATGGCCTATA CGACGGCCCA 120
GTTGGTGACT GCGTACACCA ACGCCAACCT CGGCAAGGCG CCTGACGCCG CCACCACGCT 180
GACGCTCGAC GCGTACGCGA CTCAAACCCA GACGGGCGGC CTCTCGGACG CCGCTGCGCT 240
GACCAACACC CTGAAGCTGG TCAACAGCAC GACGGCTGTT GCCATCCAGA CCTACCAGTT 300
CTTCACCGGC GTTGCCCCGT CGGCCGCTGG TCTGGACTTC CTGGTCGACT CGACCACCAA 360
CACCAACGAC CTGAACGACG CGTACTACTC GAAGTTCGCT CAGGAAAACC GCTTCATCAA 420
CTTCTCGATC AACCTGGCCA CGGGCGCCGG CGCCGGCGCG ACGGCTTTCG CCGCCGCCTA 480
CACGGGCGTT TCGTACGCCC AGACGGTCGC CACCGCCTAT GACAAGATCA TCGGCAACGC 540
CGTCGCGACC GCCGCTGGCG TCGACGTCGC GGCCGCCGTG GCTTTCCTGA GCCGCCAGGC 600
CAACATCGAC TACCTGACCG CCTTCGTGCG [illegible] [illegible] CCGCTGCCGA 660
CATCGATCTG GCCGTCAAGG CCGCCCTGAT CGGCACCATC CTGAACGCCG CCACGGTGTC 720
GGGCATCGGT GGTTACGCGA CCGCCACGGC CGCGATGATC AACGACCTGT CGGACGGCGC 780
CCTGTCGACC GACAACGCGG CTGGCGTGAA CCTGTTCACC GCCTATCCGT CGTCGGGCGT 840
GTCGGGTTCG ACCCTCTCGC TGACCACCGG CACCGACACC CTGACGGGCA CCGCCAACAA 900
CGACACGTTC GTTGCGGGTG AAGTCGCCGG CGCTGCGACC CTGACCGTTG GCGACACCCT 960
GAGCGGCGGT GCTGGCACCG ACGTCCTGAA CTGGGTGCAA GCTGCTGCGG TTACGGCTCT 1020
GCCGACCGGC GTGACGATCT CGGGCATCGA AACGATGAAC GTGACGTCGG GCGCTGCGAT 1080
CACCCTGAAC ACGTCTTCGG GCGTGACGGG TCTGACCGCC CTGAACACCA ACACCAGCGG 1140
CGCGGCTCAA ACCGTCACCG CCGGCGCTGG CCAGAACCTG ACCGCCACGA CCGCCGCTCA 1200
AGCCGCGAAC AACGTCGCCG TCGACGGGCG CGCCAACGTC ACCGTCGCCT CGACGGGCGT 1260
GACCTCGGGC ACGACCACGG TCGGCGCCAA CTCGGCCGCT TCGGGCACCG TGTCGGTGAG 1320
CGTCGCGAAC TCGAGCACGA CCACCACGGG CGCTATCGCC GTGACCGGTG GTACGGCCGT 1380
GACCGTGGCT CAAACGGCCG GCAACGCCGT GAACACCACG TTGACGCAAG CCGACGTGAC 1440
CGTGACCGGT AACTCCAGCA CCACGGCCGT GACGGTCACC CAAACGGCCG CCGCCACCGC 1500
CGGCGCTACG GTCGCCGGTC GCGTCAACGG CGCTGTGACG ATCACCGACT CTGCCGCCGC 1560
CTCGGCCACG ACCGCCGGCA AGATCGCCAC GGTCACCCTG GGCAGCTTCG GCGCCGCCAC 1620
GATCGACTCG AGCGCTCTGA CGACCGTCAA CCTGTCGGGC ACGGGCACCT CGCTCGGCAT 1680
CGGCCGCGGC GCTCTGACCG CCACGCCGAC CGCCAACACC CTGACCCTGA ACGTCAATGG 1740
TCTGACGACG ACCGGCGCGA TCACGGACTC GGAAGCGGCT GCTGACGATG GTTTCACCAC 1800
CATCAACATC GCTGGTTCGA CCGCCTCTTC GACGATCGCC AGCCTGGTGG CCGCCGACGC 1860
GACGACCCTG AACATCTCGG GCGACGCTCG CGTCACGATC ACCTCGCACA CCGCTGCCGC 1920
CCTGACGGGC ATCACGGTGA CCAACAGCGT TGGTGCGACC CTCGGCGCCG AACTGGCGAC 1980
CGGTCTGGTC TTCACGGGCG GCGCTGGCCG TGACTCGATC CTGCTGGGCG CCACGACCAA 2040
GGCGATCGTC ATGGGCGCCG GCGACGACAC CGTCACCGTC AGCTCGGCGA CCCTGGGCGC 2100
TGGTGGTTCG GTCAACGGCG GCGACGGCAC CGACGTTCTG GTGGCCAACG TCAACGGTTC 2160
GTCGTTCAGC GCTGACCCGG CCTTCGGCGG CTTCGAAACC CTCCGCGTCG CTGGCGCGGC 2220
GGCTCAAGGC TCGCACAACG CCAACGGCTT CACGGCTCTG CAACTGGGCG CGACGGCGGG 2280
TGCGACGACC TTCACCAACG TTGCGGTGAA TGTCGGCCTG ACCGTTCTGG CGGCTCCGAC 2340
CGGTACGACG ACCGTGACCC TGGCCAACGC CACGGGCACC TCGGACGTGT TCAACCTGAC 2400
CCTGTCGTCC TCGGCCGCTC TGGCCGCTGG TACGGTTGCG CTGGCTGGCG TCGAGACGGT 2460
GAACATCGCC GCCACCGACA CCAACACGAC CGCTCACGTC GACACGCTGA CGCTGCAAGC 2520
CACCTCGGCC AAGTCGATCG TGGTGACGGG CAACGCCGGT CTGAACCTGA CCAACACCGG 2580
CAACACGGCT GTCACCAGCT TCGACGCCAG CGCCGTCACC GGCACGGCTC CGGCTGTGAC 2640
CTTCGTGTCG GCCAACACCA CGGTGGGTGA AGTCGTCACG ATCCGCGGCG GCGCTGGCGC 2700
CGACTCGCTG ACCGGTTCGG CCACCGCCAA TGACACCATC ATCGGTGGCG CTGGCGCTGA 2760
CACCCTGGTC TACACCGGCG GTACGGACAC CTTCACGGGT GGCACGGGCG CGGATATCTT 2820
CGATATCAAC GCTATCGGCA CCTCGACCGC TTTCGTGACG ATCACCGACG CCGCTGTCGG 2880
CGACAAGCTC GACCTCGTCG GCATCTCGAC GAACGGCGCT ATCGCTGACG GCGCCTTCGG 2940
CGCTGCGGTC ACCCTGGGCG CTGCTGCGAC CCTGGCTCAG TACCTGGACG CTGCTGCTGC 3000
CGGCGACGGC AGCGGCACCT CGGTTGCCAA GTGGTTCCAG TTCGGCGGCG ACACCTATGT 3060
CGTCGTTGAC AGCTCGGCTG GCGCGACCTT CGTCAGCGGC GCTGACGCGG TGATCAAGCT 3120
GACCGGTCTG GTCACGCTGA CCACCTCGGC CTTCGCCACC GAAGTCCTGA CGCTCGCCTA 3180
AGCGAACGTC TGATCCTCGC CTAGGCGAGG ATCGCTAGAC TAAGAGACCC CGTCTTCCGA 3240
AAGGGAGGCG GGGTCTTTCT TATGGGCGCT ACGCGCTGGC CGGCCTTGCC TAGTTCCGGT 3300

Met Ala Tyr Thr Thr Ala Gln Leu Val Thr Ala Tyr Thr Asn Ala Asn
1 5 10 15

Leu Gly Lys Ala Pro Asp Ala Ala Thr Thr Leu Thr Leu Asp Ala Tyr
20 25 30

Ala Thr Gln Thr Gln Thr Gly Gly Leu Ser Asp Ala Ala Ala Leu Thr
35 40 45

Asn Thr Leu Lys Leu Val Asn Ser Thr Thr Ala Val Ala Ile Gln Thr
50 55 60

Tyr Gln Phe Phe Thr Gly Val Ala Pro Ser Ala Ala Gly Leu Asp Phe
65 70 75 80

Leu Val Asp Ser Thr Thr Asn Thr Asn Asp Leu Asn Asp Ala Tyr Tyr
85 90 95

Ser Lys Phe Ala Gln Glu Asn Arg Phe Ile Asn Phe Ser Ile Asn Leu
100 105 110

Ala Thr Gly Ala Gly Ala Gly Ala Thr Ala Phe Ala Ala Ala Tyr Thr
115 120 125

Gly Val Ser Tyr Ala Gln Thr Val Ala Thr Ala Tyr Asp Lys Ile Ile
130 135 140

Gly Asn Ala Val Ala Thr Ala Ala Gly Val Asp Val Ala Ala Ala Val
145 150 155 160

Ala Phe Leu Ser Arg Gln Ala Asn Ile Asp Tyr Leu Thr Ala Phe Val
165 170 175

Arg Ala Asn Thr Pro Phe Thr Ala Ala Ala Asp Ile Asp Leu Ala Val
180 185 190

Lys Ala Ala Leu Ile Gly Thr Ile Leu Asn Ala Ala Thr Val Ser Gly
195 200 205

Ile Gly Gly Tyr Ala Thr Ala Thr Ala Ala Met Ile Asn Asp Leu Ser
210 215 220

Asp Gly Ala Leu Ser Thr Asp Asn Ala Ala Gly Val Asn Leu Phe Thr
225 230 235 240

Ala Tyr Pro Ser Ser Gly Val Ser Gly Ser Thr Leu Ser Leu Thr Thr
245 250 255

Gly Thr Asp Thr Leu Thr Gly Thr Ala Asn Asn Asp Thr Phe Val Ala
260 265 270

Gly Glu Val Ala Gly Ala Ala Thr Leu Thr Val Gly Asp Thr Leu Ser
275 280 285

Gly Gly Ala Gly Thr Asp Val Leu Asn Trp Val Gln Ala Ala Ala Val
290 295 300

Thr Ala Leu Pro Thr Gly Val Thr Ile Ser Gly Ile Glu Thr Met Asn
305 310 315 320

Val Thr Ser Gly Ala Ala Ile Thr Leu Asn Thr Ser Ser Gly Val Thr
325 330 335

Gly Leu Thr Ala Leu Asn Thr Asn Thr Ser Gly Ala Ala Gln Thr Val
340 345 350

Thr Ala Gly Ala Gly Gln Asn Leu Thr Ala Thr Thr Ala Ala Gln Ala
355 360 365

Ala Asn Asn Val Ala Val Asp Gly Arg Ala Asn Val Thr Val Ala Ser
370 375 380

Thr Gly Val Thr Ser Gly Thr Thr Thr Val Gly Ala Asn Ser Ala Ala
385 390 395 400

Ser Gly Thr Val Ser Val Ser Val Ala Asn Ser Ser Thr Thr Thr Thr
405 410 415

Gly Ala Ile Ala Val Thr Gly Gly Thr Ala Val Thr Val Ala Gln Thr
420 425 430

Ala Gly Asn Ala Val Asn Thr Thr Leu Thr Gln Ala Asp Val Thr Val
435 440 445

Thr Gly Asn Ser Ser Thr Thr Ala Val Thr Val Thr Gln Thr Ala Ala
450 455 460

Ala Thr Ala Gly Ala Thr Val Ala Gly Arg Val Asn Gly Ala Val Thr
465 470 475 480

Ile Thr Asp Ser Ala Ala Ala Ser Ala Thr Thr Ala Gly Lys Ile Ala
485 490 495

Thr Val Thr Leu Gly Ser Phe Gly Ala Ala Thr Ile Asp Ser Ser Ala
500 505 510

Leu Thr Thr Val Asn Leu Ser Gly Thr Gly Thr Ser Leu Gly Ile Gly
515 520 525

Arg Gly Ala Leu Thr Ala Thr Pro Thr Ala Asn Thr Leu Thr Leu Asn
530 535 540

Val Asn Gly Leu Thr Thr Thr Gly Ala Ile Thr Asp Ser Glu Ala Ala
545 550 555 560

Ala Asp Asp Gly Phe Thr Thr Ile Asn Ile Ala Gly Ser Thr Ala Ser
565 570 575

Ser Thr Ile Ala Ser Leu Val Ala Ala Asp Ala Thr Thr Leu Asn Ile
580 585 590

Ser Gly Asp Ala Arg Val Thr Ile Thr Ser His Thr Ala Ala Ala Leu
595 600 605

Thr Gly Ile Thr Val Thr Asn Ser Val Gly Ala Thr Leu Gly Ala Glu
610 615 620

Leu Ala Thr Gly Leu Val Phe Thr Gly Gly Ala Gly Arg Asp Ser Ile
625 630 635 640

Leu Leu Gly Ala Thr Thr Lys Ala Ile Val Met Gly Ala Gly Asp Asp
645 650 655

Thr Val Thr Val Ser Ser Ala Thr Leu Gly Ala Gly Gly Ser Val Asn
660 665 670

Gly Gly Asp Gly Thr Asp Val Leu Val Ala Asn Val Asn Gly Ser Ser
675 680 685

Phe Ser Ala Asp Pro Ala Phe Gly Gly Phe Glu Thr Leu Arg Val Ala
690 695 700




Appendix 1 (cont'd) 






Gly Ala Ala Ala Gln Gly Ser His Asn Ala Asn Gly Phe Thr Ala Leu
705 710 715 720

Gln Leu Gly Ala Thr Ala Gly Ala Thr Thr Phe Thr Asn Val Ala Val
725 730 735

Asn Val Gly Leu Thr Val Leu Ala Ala Pro Thr Gly Thr Thr Thr Val
740 745 750

Thr Leu Ala Asn Ala Thr Gly Thr Ser Asp Val Phe Asn Leu Thr Leu
755 760 765

Ser Ser Ser Ala Ala Leu Ala Ala Gly Thr Val Ala Leu Ala Gly Val
770 775 780

Glu Thr Val Asn Ile Ala Ala Thr Asp Thr Asn Thr Thr Ala His Val
785 790 795 800

Asp Thr Leu Thr Leu Gln Ala Thr Ser Ala Lys Ser Ile Val Val Thr
805 810 815

Gly Asn Ala Gly Leu Asn Leu Thr Asn Thr Gly Asn Thr Ala Val Thr
820 825 830

Ser Phe Asp Ala Ser Ala Val Thr Gly Thr Ala Pro Ala Val Thr Phe
835 840 845

Val Ser Ala Asn Thr Thr Val Gly Glu Val Val Thr Ile Arg Gly Gly
850 855 860

Ala Gly Ala Asp Ser Leu Thr Gly Ser Ala Thr Ala Asn Asp Thr Ile
865 870 875 880

Ile Gly Gly Ala Gly Ala Asp Thr Leu Val Tyr Thr Gly Gly Thr Asp
885 890 895

Thr Phe Thr Gly Gly Thr Gly Ala Asp Ile Phe Asp Ile Asn Ala Ile
900 905 910

Gly Thr Ser Thr Ala Phe Val Thr Ile Thr Asp Ala Ala Val Gly Asp
915 920 925

Lys Leu Asp Leu Val Gly Ile Ser Thr Asn Gly Ala Ile Ala Asp Gly
930 935 940

Ala Phe Gly Ala Ala Val Thr Leu Gly Ala Ala Ala Thr Leu Ala Gln
945 950 955 960

Tyr Leu Asp Ala Ala Ala Ala Gly Asp Gly Ser Gly Thr Ser Val Ala
965 970 975

Lys Trp Phe Gln Phe Gly Gly Asp Thr Tyr Val Val Val Asp Ser Ser
980 985 990

Ala Gly Ala Thr Phe Val Ser Gly Ala Asp Ala Val Ile Lys Leu Thr
995 1000 1005

Gly Leu Val Thr Leu Thr Thr Ser Ala Phe Ala Thr Glu Val Leu Thr
1010 1015 1020

Leu Ala
1025

GAA TTC AGA TCT CAG GGC GCG GGG CAG GGT GGC TAT GGT GGG CTC GGC TCG CAA GGC GCT
EFRSQGAGQGGYGGLGSQGA

GGC CTG GGT GGC CAG GGC GCT GGC GCG GCC GCG GCC GCT GCG GCC GGT GGC
GRGGQGAGAAAAAAAGG

GCT GGC CAG GGC GGG CTG GGC TCG CAG GGC GCC GGC CAA GGC GCT GGC GCC GCG GCC GCT
AGQGGLGSQGAGQGAGAAAA

GCG GCC GGT GGC GCC GGC CAG GGT GGC TAC GGC GGC CTG GGC AGC CAG GGC GCC GGT CGC
AAGGAGQGGYGGLGSQGAGR

GGC GGT CAG GGC GCC GGT GCC GCG GCC GCT GCG GCC GGT GGC GCT GGG CAA GGC GGC TAC
GGQGAGAAAAAAGGAGQGGY

GGC GGT CTG GGA TCC
GGLGS

1/1

atg aac aca aac aag gca acc gca act tac ttg aaa tcc att atg ctt cca gag act
met asn thr asn lys ala thr ala thr tyr leu lys ser ile met leu pro glu thr

61/21

cca gca agc atc ccg gac gac ata acg gag aga cac atc tta aaa caa gag acc tcg tca
pro ala ser ile pro asp asp ile thr glu arg his ile leu lys gln glu thr ser ser

121/41

tac aac tta gag gtc tcc gaa tca gga agt ggc att ctt gtt tgt ttc cct ggg gca cca
tyr asn leu glu val ser glu ser gly ser gly ile leu val cys phe pro gly ala pro

181/61

ggc tca cgg atc ggt gca cac tac aga tgg aat gcg aac cag acg ggg ctg gag ttc gac
gly ser arg ile gly ala his tyr arg trp asn ala asn gln thr gly leu glu phe asp

241/81

cag tgg ctg gag acg tcg cag gac ctg aag aaa gcc ttc aac tac ggg agg ctg atc tca
gln trp leu glu thr ser gln asp leu lys lys ala phe asn tyr gly arg leu ile ser

301/101

agg aaa tac gac att caa agc tcc aca cta ccg gcc ggt ctc tat gct ctg aac ggg acg
arg lys tyr asp ile gln ser ser thr leu pro ala gly leu tyr ala leu asn gly thr

361/121

ctc aac gct gcc acc ttc gaa ggc agt ctg tct gag gtg gag agc ctg acc tac aat agc
leu asn ala ala thr phe glu gly ser leu ser glu val glu ser leu thr tyr asn ser

421/141

ctg atg tcc cta act acg aac ccc cag gac aaa gcc aac aac cag ctg gtg acc aaa
leu met ser leu thr thr asn pro gln asp lys ala asn asn gln leu val thr lys gly

481/161

gtc acc gtc ctg aat cta cca aca ggg ttc gac aaa cca tac gtc cgc cta gag gac gag
val thr val leu asn leu pro thr gly phe asp lys pro tyr val arg leu glu asp glu

541/181

aca ccc cag ggt ctc cag tca atg aac ggg gcc agg atg agg tgc aca gct gca att gca
thr pro gln gly leu gln ser met asn gly ala arg met arg cys thr ala ala ile ala

601/201

cca cgg agg tac gag atc gac ctc cca tcc caa agc cta ccc ccc gtt cct gcg aca gga
pro arg arg tyr glu ile asp leu pro ser gln ser leu pro pro val pro ala thr gly

661/221

acc ctc acc act ctc tac gag gga aac gcc gac atc gtc agc tcc aca aca gtg acg gga
thr leu thr thr leu tyr glu gly asn ala asp ile val ser ser thr thr val thr gly

721/241

gac ata aac ttc agt ctg gca gaa cga ccc gca aac gag acc agg ttc gac ttc cag ctg
asp ile asn phe ser leu ala glu arg pro ala asn glu thr arg phe asp phe gln leu

The T3 protein sequence is:

FACKTANGTAIPIGGGSANVYVNLAPWNVGQNLWDLSTQIFCHNDYPETITDYVTLQRGSA 
SYPFPTTSETPRWYNSRTDKPWPVALYLTPVSSAGGVAIKAGSLIAVLILRQTNNYNSDDFQ 
CDVSA 

The T7 protein sequence is: 

FACKTANGTAIPIGGGSANVYVNLAPWNVGQNLWDLSTQIFCHNDYPETITDYVTLQRGSA 
SYPFPTTSETPRWYNSRTDKPWPVALYLTPVSSAGGVAIKAGSLIAVLILRQTNNYNSDDFQ 
CDVSARDVTVTLPDYRGSVPIPLTVYCAKSQNLGYYLSGTHADAGNSIFTNTASFSPAQGVG
GAVGTSAVSLGLTANYARTGGQVTAGNVQSIIGVTFVYQ

WHAT IS CLAIMED IS:

1. A method of cleaving a fusion protein including a first component which comprises all
or part of a Caulobacter S-layer protein including a Caulobacter C-terminal secretion
signal, and a second component heterologous to Caulobacter, the fusion protein
containing at least one aspartate-proline dipeptide, wherein the method comprises
combining the fusion protein with an acid solution of a strength insufficient to
solubilize the fusion protein for a time sufficient for cleavage of the fusion protein at
said aspartate-proline dipeptide.

2. The method of claim 1 wherein an aspartate-proline dipeptide is situated between the
first and second components or adjacent a junction between the first and second 
components. 

3. The method of claim 1 or 2, wherein the acid solution has a pH of from about 1.3 to
about 2.5.

4. The method of claim 1 or 2, wherein the acid solution has a pH of about 1.65 to about
2.35.

5. The method of any one of claims 1-4 wherein the method is carried out at a
temperature in the range of about 30° C. to about 50° C.

6. The method of any one of claims 1-5, wherein the method further comprises
separating products cleaved from the fusion protein.

7. A method of preparing a DNA construct for expression of a fusion protein suitable for
use in the method of claim 1, wherein the method comprises joining an upstream
DNA segment including DNA heterologous to Caulobacter which encodes a protein
of interest, to a downstream DNA segment including DNA for a Caulobacter
C-terminal secretion signal, wherein the downstream DNA segment does not encode an
aspartate-proline dipeptide, and wherein the upstream segment contains DNA
encoding an aspartate-proline dipeptide at or near an end of said upstream segment to
be joined to said downstream segment.

8. A method of preparing a fusion protein, comprising:

(1) expressing a DNA construct prepared as described in claim 7 in
Caulobacter; and

(2) recovering said fusion protein secreted by the Caulobacter.